RcometsAnalytics supports all cohort-specific analyses of the COMETS
consortium. This collaborative work is done via the COMETS harmonization
group activities. For more information, see the COMETS website.
This vignette demonstrates using the RcometsAnalytics R package from the
command line, while
the tutorial
demonstrates using RcometsAnalytics from the GUI. Documentation of the
RcometsAnalytics R package can be found here manual.
Each project could create their own vignette to run the analyses.
The required input file should be in excel format with the following 6 sheets:
Complete documentation of the various sheets can be found in the package documentation. An example input file is available HERE.
Only empty cells in any excel sheet become missing values when the R software reads the sheet into a data frame. Any non-numeric value in a cell for a continuous variable will result in an error.
The first step of the analysis is to load in the data with the
readCOMETSinput() function.
Input for this function is an Excel spreadsheet, per the description
above.
# Retrieve the full path of the input data
dir <- system.file("extdata", package="RcometsAnalytics", mustWork=TRUE)
csvfile <- file.path(dir, "cometsInputAge.xlsx")
# Read in and process the input data
exmetabdata <- RcometsAnalytics::readCOMETSinput(csvfile)
## VarMap sheet is read in.
## Metabolites sheet is read in.
## SubjectMetabolites sheet is read in.
## SubjectData sheet is read in.
## Models sheet is read in.
## Model_Types sheet is read in.
## There are 16 categorical variables.
## Running Integrity Check...
## Joining, by = "hmdb_id"
## Begin testing models in Models sheet...
## Filtering subjects according to the rule(s) age< 70. 836 of 1000 are retained.
## Warning in runModel.addRemVars(rem.obj, vars[oneVal], vars.type, "too few unique
## non-missing values", : The variable(s) female, fasted have been removed from
## adjvars because of: too few unique non-missing values
## Warning in runModel.addRemVars(rem.obj, tmp[rem], varSet, "correlated with
## another predictor", : The variable(s) multivitamin.2 have been removed from
## adjvars because of: correlated with another predictor
## Finished testing models in Models sheet.
To plot some the distribution of variances for each metabolite:
RcometsAnalytics::plotVar(exmetabdata,titlesize=12)
## Warning: The titlefont attribute is deprecated. Use title = list(font = ...)
## instead.
To plot the distribution of minimum/missing values:
RcometsAnalytics::plotMinvalues(exmetabdata,titlesize=12)
## Warning: The titlefont attribute is deprecated. Use title = list(font = ...)
## instead.
There are 2 ways to specify your model, batch or interactive. In Batch mode, models are specified in your input file Models sheet. The model information needs to be read in with the function getModelData() and processed so the software knows which models to run. The following call defines the “1 Gender adjusted” model from the Models sheet in the input file to be run.
exmodeldata <- RcometsAnalytics::getModelData(exmetabdata,modlabel="1 Gender adjusted")
In Interactive mode, models are specified as parameters. The model
information needs to be read in with the function
getModelData() and processed so the software knows which models
to run.
The following call defines the model with age and bmi_grp as the
exposure variables, and includes only the subjects with age > 40 and
bmi_grp > 2.
exmodeldata2 <- RcometsAnalytics::getModelData(exmetabdata, modelspec="Interactive",
exposures=c("age","bmi_grp"), where=c("age>40","bmi_grp>2"))
## Filtering subjects according to the rule(s) age>40 AND bmi_grp>2. 279 of 1000 are retained.
The runModel() function is the main function for running a single model, and by default, a correlation analysis is performed. The string “DPP” is a label for the cohort which will appear under the “cohort” column in the output.
excorrdata <- RcometsAnalytics::runModel(exmodeldata2,exmetabdata,"DPP")
The output of the correlation analysis can then be compiled and output to an Excel file with the following function:
RcometsAnalytics::OutputListToExcel(filename="DPP_corr.xlsx", excorrdata)
To view the first 3 lines of the correlation analysis output, simply type:
RcometsAnalytics::showModel(excorrdata,nlines=3)
##
## ModelSummary:
## run outcomespec exposurespec term nobs message adjvars
## 1 1 _1_2_3_benzenetriol_sulfate_2 age 279
## 2 2 _1_2_dipalmitoylglycerol age 279
## 3 3 _1_2_propanediol age 279
## adjvars.removed adjspec outcome_uid outcome
## 1 CHEM100006374 1,2,3-benzenetriol sulfate (2)
## 2 HMDB07098 DG(16:0/16:0/0:0)
## 3 HMDB01881 Propylene glycol
## exposure_uid exposure adj_uid metabolite_name
## 1 age Age at Entry 1,2,3-benzenetriol sulfate (2)
## 2 age Age at Entry 1,2-dipalmitoylglycerol
## 3 age Age at Entry 1,2-propanediol
##
## Effects:
## run outcomespec exposurespec term estimate pvalue
## 1 1 _1_2_3_benzenetriol_sulfate_2 age age 0.164624501 0.005846722
## 2 2 _1_2_dipalmitoylglycerol age age 0.068903188 0.251337451
## 3 3 _1_2_propanediol age age 0.001667259 0.977882521
## metabolite_name
## 1 1,2,3-benzenetriol sulfate (2)
## 2 1,2-dipalmitoylglycerol
## 3 1,2-propanediol
##
## Errors_Warnings:
## [1] type object message
## <0 rows> (or 0-length row.names)
##
## Table1:
## variable in.model type category n n.unique min quartile1 median
## 1 age exposure continuous 279 20 55 59 62
## 2 bmi_grp exposure categorical 3 269 NA NA NA NA
## 3 4 10 NA NA NA NA
## mean quartile3 max n.missing
## 1 62.68459 66 74 0
## 2 NA NA NA NA
## 3 NA NA NA NA
##
## Info:
## name value
## 1 date 2022-10-21 13:10:02
## 2 cohort DPP
## 3 RcometsAnalytics version 2.0.6.0
## NULL
To display the heatmap of the resulting correlation matrix, use the showheatmap function.
RcometsAnalytics::showHeatmap(excorrdata)
To display the hierarchical clustering of the resulting correlation
matrix, use the showHClust function. This diplay requires at least 2
rows and 2 columns in the correlation matrix.
exmodeldata<-RcometsAnalytics::getModelData(exmetabdata,modelspec = "Interactive",exposures = c("bmi_grp","age"))
excorrdata <- RcometsAnalytics::runModel(exmodeldata,exmetabdata,"DPP")
RcometsAnalytics::showHClust(excorrdata, showticklabels=FALSE)
Results can be written to an output Excel file with the following command:
RcometsAnalytics::OutputListToExcel("Model1.xlsx", excorrdata)
A stratified correlation analysis can be performed by specifiying stratification variables in the call to getModelData(). If more than one stratification variable is specified, then the strata will be defined by all unique combinations of the stratification variables. The following call will define a model stratified by race_grp.
exmodeldata2 <- RcometsAnalytics::getModelData(exmetabdata,modelspec="Interactive",
outcomes=c("lactose","lactate"),
exposures=c("age","bmi_grp"),strvars="race_grp")
The stratified correlation analysis is run by calling the runModel() function.
excorrdata2 <- RcometsAnalytics::runModel(exmodeldata2,exmetabdata,"DPP")
Call getModelData() to define a model which adjusts for age group, has lactose and lactate as outcome variables, and has age and bmi group as the exposure variables.
exmodeldata <- RcometsAnalytics::getModelData(exmetabdata,modelspec="Interactive", adjvars="age_grp",
outcomes=c("lactose","lactate"), exposures=c("age","bmi_grp"))
To run a linear regression using the lm function, a list of options must be passed into runModel() with the model option set to “lm”.
lm_results <- RcometsAnalytics::runModel(exmodeldata, exmetabdata, "DPP", op=list(model="lm"))
print(lm_results)
## $ModelSummary
## run outcomespec exposurespec term wald.pvalue nobs message
## 1 1 lactose age age 0.164512670 1000
## 2 1 lactose age age_grp 0.202526776 1000
## 3 2 lactate age age 0.601499033 1000
## 4 2 lactate age age_grp 0.983098762 1000
## 5 3 lactose bmi_grp bmi_grp 0.023879815 1000
## 6 3 lactose bmi_grp age_grp 0.000878276 1000
## 7 4 lactate bmi_grp bmi_grp 0.001662820 1000
## 8 4 lactate bmi_grp age_grp 0.620725533 1000
## adjvars adjvars.removed adjspec outcome_uid outcome
## 1 age_grp.2 age_grp.3 age_grp HMDB00186 Alpha-Lactose
## 2 age_grp.2 age_grp.3 age_grp HMDB00186 Alpha-Lactose
## 3 age_grp.2 age_grp.3 age_grp HMDB00190 L-Lactic acid
## 4 age_grp.2 age_grp.3 age_grp HMDB00190 L-Lactic acid
## 5 age_grp.2 age_grp.3 age_grp HMDB00186 Alpha-Lactose
## 6 age_grp.2 age_grp.3 age_grp HMDB00186 Alpha-Lactose
## 7 age_grp.2 age_grp.3 age_grp HMDB00190 L-Lactic acid
## 8 age_grp.2 age_grp.3 age_grp HMDB00190 L-Lactic acid
## exposure_uid exposure adj_uid metabolite_name
## 1 age Age at Entry age_grp.2 age_grp.3 lactose
## 2 age Age at Entry age_grp.2 age_grp.3 lactose
## 3 age Age at Entry age_grp.2 age_grp.3 lactate
## 4 age Age at Entry age_grp.2 age_grp.3 lactate
## 5 bmi_grp bmi_grp age_grp.2 age_grp.3 lactose
## 6 bmi_grp bmi_grp age_grp.2 age_grp.3 lactose
## 7 bmi_grp bmi_grp age_grp.2 age_grp.3 lactate
## 8 bmi_grp bmi_grp age_grp.2 age_grp.3 lactate
##
## $Effects
## run outcomespec exposurespec term estimate std.error statistic
## 1 1 lactose age age 0.034697534 0.024961296 1.3900534
## 2 2 lactate age age -0.002083854 0.003990177 -0.5222460
## 3 3 lactose bmi_grp bmi_grp.2 -0.047905160 0.134163189 -0.3570663
## 4 3 lactose bmi_grp bmi_grp.3 0.360610471 0.145954081 2.4707118
## 5 3 lactose bmi_grp bmi_grp.4 0.420224133 0.577141138 0.7281133
## 6 4 lactate bmi_grp bmi_grp.2 0.034207637 0.021367743 1.6009008
## 7 4 lactate bmi_grp bmi_grp.3 0.080743228 0.023245640 3.4734783
## 8 4 lactate bmi_grp bmi_grp.4 -0.126241151 0.091919426 -1.3733892
## pvalue estimate.lower estimate.upper metabolite_name
## 1 0.1648232433 -0.014225708 0.08362078 lactose
## 2 0.6016151626 -0.009904458 0.00573675 lactate
## 3 0.7211179261 -0.310860179 0.21504986 lactose
## 4 0.0136512756 0.074545728 0.64667521 lactose
## 5 0.4667157218 -0.710951712 1.55139998 lactose
## 6 0.1097166272 -0.007672369 0.07608764 lactate
## 7 0.0005359045 0.035182610 0.12630385 lactate
## 8 0.1699410577 -0.306399914 0.05391761 lactate
##
## $Errors_Warnings
## [1] type object message
## <0 rows> (or 0-length row.names)
##
## $Table1
## variable in.model type category n n.unique min quartile1 median
## 1 age exposure continuous 1000 20 55 59 63
## 2 bmi_grp exposure categorical 1 351 NA NA NA NA
## 3 2 370 NA NA NA NA
## 4 3 269 NA NA NA NA
## 5 4 10 NA NA NA NA
## 6 age_grp adjustment categorical 1 271 NA NA NA NA
## 7 2 565 NA NA NA NA
## 8 3 164 NA NA NA NA
## mean quartile3 max n.missing
## 1 63.208 67 74 0
## 2 NA NA NA NA
## 3 NA NA NA NA
## 4 NA NA NA NA
## 5 NA NA NA NA
## 6 NA NA NA NA
## 7 NA NA NA NA
## 8 NA NA NA NA
##
## $Info
## name
## 1 date
## 2 cohort
## 3 RcometsAnalytics version
## 4 R version
## 5 operating system
## 6 input file
## 7 model name
## 8 run type
## 9 op$check.cor.method
## 10 op$check.cor.cutoff
## 11 op$check.nsubjects
## 12 op$max.nstrata
## 13 op$model
## 14 op$output.Effects
## 15 op$output.ModelSummary
## 16 op$output.ci_alpha
## 17 op$output.metab.cols
## 18 op$output.type
## 19 op$output.merge
## 20 op$chemEnrich
## 21 op$chemEnrich.adjPvalue
## 22 model.options$family
## 23 outcome
## 24 exposure
## 25 strata
## value
## 1 2022-10-21 13:10:08
## 2 DPP
## 3 2.0.6.0
## 4 R version 4.2.0 (2022-04-22)
## 5 CentOS Linux 7 (Core)
## 6 /gpfs/gsfs10/users/wheelerwi/Rlibs/RcometsAnalytics/extdata/cometsInputAge.xlsx
## 7
## 8 Interactive
## 9 spearman
## 10 0.97
## 11 25
## 12 10
## 13 lm
## 14 exposure
## 15 anova
## 16 0.95
## 17 metabolite_name
## 18 xlsx
## 19 none
## 20 0
## 21 0.05
## 22
## 23 *
## 24 *
## 25
##
## attr(,"ptime")
## [1] "Processing time: 0.313 sec"
## attr(,"class")
## [1] "runModel"
Run a linear regression using the glm function for the same variables as above. The default family used with glm is “gaussian”, which corresponds to a linear regression. The Effects data frame will be the same as with lm, but the ModelSummary data frame will contain some different columns.
glm_results <- RcometsAnalytics::runModel(exmodeldata, exmetabdata, "DPP", op=list(model="glm"))
print(all.equal(lm_results$Effects, glm_results$Effects))
## [1] TRUE
print(glm_results$ModelSummary)
## run outcomespec exposurespec converged term wald.pvalue nobs message
## 1 1 lactose age 1 age 0.164512670 1000
## 2 1 lactose age 1 age_grp 0.202526776 1000
## 3 2 lactate age 1 age 0.601499033 1000
## 4 2 lactate age 1 age_grp 0.983098762 1000
## 5 3 lactose bmi_grp 1 bmi_grp 0.023879815 1000
## 6 3 lactose bmi_grp 1 age_grp 0.000878276 1000
## 7 4 lactate bmi_grp 1 bmi_grp 0.001662820 1000
## 8 4 lactate bmi_grp 1 age_grp 0.620725533 1000
## adjvars adjvars.removed adjspec outcome_uid outcome
## 1 age_grp.2 age_grp.3 age_grp HMDB00186 Alpha-Lactose
## 2 age_grp.2 age_grp.3 age_grp HMDB00186 Alpha-Lactose
## 3 age_grp.2 age_grp.3 age_grp HMDB00190 L-Lactic acid
## 4 age_grp.2 age_grp.3 age_grp HMDB00190 L-Lactic acid
## 5 age_grp.2 age_grp.3 age_grp HMDB00186 Alpha-Lactose
## 6 age_grp.2 age_grp.3 age_grp HMDB00186 Alpha-Lactose
## 7 age_grp.2 age_grp.3 age_grp HMDB00190 L-Lactic acid
## 8 age_grp.2 age_grp.3 age_grp HMDB00190 L-Lactic acid
## exposure_uid exposure adj_uid metabolite_name
## 1 age Age at Entry age_grp.2 age_grp.3 lactose
## 2 age Age at Entry age_grp.2 age_grp.3 lactose
## 3 age Age at Entry age_grp.2 age_grp.3 lactate
## 4 age Age at Entry age_grp.2 age_grp.3 lactate
## 5 bmi_grp bmi_grp age_grp.2 age_grp.3 lactose
## 6 bmi_grp bmi_grp age_grp.2 age_grp.3 lactose
## 7 bmi_grp bmi_grp age_grp.2 age_grp.3 lactate
## 8 bmi_grp bmi_grp age_grp.2 age_grp.3 lactate
Call getModelData() to define a model which adjusts for age group, has nested_case as the outcome variable, and has lactose and lactate as the exposure variables. The variable nested_case must be a binary 0-1 variable.
exmodeldata <- RcometsAnalytics::getModelData(exmetabdata,modelspec="Interactive", adjvars="age_grp",
outcomes="nested_case", exposures=c("lactose","lactate"))
To run a logistic regression, the list of options op must also include a model.options list with family set to “binomial”.
op <- list(model="glm", model.options=list(family="binomial"))
glm_results <- RcometsAnalytics::runModel(exmodeldata, exmetabdata, "DPP", op=op)
print(glm_results)
## $ModelSummary
## run outcomespec exposurespec converged term wald.pvalue nobs message
## 1 1 nested_case lactose 1 lactose 0.51856182 1000
## 2 1 nested_case lactose 1 age_grp 0.71229325 1000
## 3 2 nested_case lactate 1 lactate 0.01003264 1000
## 4 2 nested_case lactate 1 age_grp 0.63652620 1000
## adjvars adjvars.removed adjspec outcome_uid outcome
## 1 age_grp.2 age_grp.3 age_grp nested_case nested_case
## 2 age_grp.2 age_grp.3 age_grp nested_case nested_case
## 3 age_grp.2 age_grp.3 age_grp nested_case nested_case
## 4 age_grp.2 age_grp.3 age_grp nested_case nested_case
## exposure_uid exposure adj_uid metabolite_name
## 1 HMDB00186 Alpha-Lactose age_grp.2 age_grp.3 lactose
## 2 HMDB00186 Alpha-Lactose age_grp.2 age_grp.3 lactose
## 3 HMDB00190 L-Lactic acid age_grp.2 age_grp.3 lactate
## 4 HMDB00190 L-Lactic acid age_grp.2 age_grp.3 lactate
##
## $Effects
## run outcomespec exposurespec term estimate std.error statistic
## 1 1 nested_case lactose lactose 0.02268456 0.03513914 0.6455639
## 2 2 nested_case lactate lactate 0.57214513 0.22221799 2.5747022
## pvalue exp.estimate exp.std.error estimate.lower estimate.upper
## 1 0.51856182 1.022944 0.03594537 -0.04618689 0.09155601
## 2 0.01003264 1.772064 0.39378456 0.13660588 1.00768438
## exp.estimate.lower exp.estimate.upper metabolite_name
## 1 0.9548635 1.095878 lactose
## 2 1.1463763 2.739251 lactate
##
## $Errors_Warnings
## [1] type object message
## <0 rows> (or 0-length row.names)
##
## $Table1
## variable in.model type category n n.outcomeEqual0
## 1 nested_case outcome continuous 1000 502
## 2 age_grp adjustment categorical 1 271 142
## 3 2 565 280
## 4 3 164 80
## n.outcomeEqual1 n.unique min quartile1 median mean quartile3 max n.missing
## 1 498 2 0 0 0 0.498 1 1 0
## 2 129 NA NA NA NA NA NA NA NA
## 3 285 NA NA NA NA NA NA NA NA
## 4 84 NA NA NA NA NA NA NA NA
##
## $Info
## name
## 1 date
## 2 cohort
## 3 RcometsAnalytics version
## 4 R version
## 5 operating system
## 6 input file
## 7 model name
## 8 run type
## 9 op$check.cor.method
## 10 op$check.cor.cutoff
## 11 op$check.nsubjects
## 12 op$max.nstrata
## 13 op$model
## 14 op$output.Effects
## 15 op$output.ModelSummary
## 16 op$output.ci_alpha
## 17 op$output.metab.cols
## 18 op$output.type
## 19 op$output.merge
## 20 op$chemEnrich
## 21 op$chemEnrich.adjPvalue
## 22 model.options$family
## 23 model.options$link
## 24 outcome
## 25 exposure
## 26 strata
## value
## 1 2022-10-21 13:10:08
## 2 DPP
## 3 2.0.6.0
## 4 R version 4.2.0 (2022-04-22)
## 5 CentOS Linux 7 (Core)
## 6 /gpfs/gsfs10/users/wheelerwi/Rlibs/RcometsAnalytics/extdata/cometsInputAge.xlsx
## 7
## 8 Interactive
## 9 spearman
## 10 0.97
## 11 25
## 12 10
## 13 glm
## 14 exposure
## 15 anova
## 16 0.95
## 17 metabolite_name
## 18 xlsx
## 19 none
## 20 0
## 21 0.05
## 22 binomial
## 23 logit
## 24 nested_case
## 25 *
## 26
##
## attr(,"ptime")
## [1] "Processing time: 0.285 sec"
## attr(,"class")
## [1] "runModel"
Call getModelData() to define a model which adjusts for age group, has n_visits as the outcome variable, and has lactose and lactate as the exposure variables. The variable n_visits must be a non-negative integer valued variable.
exmodeldata <- RcometsAnalytics::getModelData(exmetabdata,modelspec="Interactive", adjvars="age_grp",
outcomes="n_visits", exposures=c("lactose","lactate"))
To run a Poisson regression, the list of options op must also include a model.options list with family set to “poisson”.
op <- list(model="glm", model.options=list(family="poisson"))
poisson_results <- RcometsAnalytics::runModel(exmodeldata, exmetabdata, "DPP", op=op)
print(poisson_results)
## $ModelSummary
## run outcomespec exposurespec converged term wald.pvalue nobs message
## 1 1 n_visits lactose 1 lactose 0.6105400 1000
## 2 1 n_visits lactose 1 age_grp 0.3528081 1000
## 3 2 n_visits lactate 1 lactate 0.8271571 1000
## 4 2 n_visits lactate 1 age_grp 0.3719447 1000
## adjvars adjvars.removed adjspec outcome_uid outcome exposure_uid
## 1 age_grp.2 age_grp.3 age_grp n_visits n_visits HMDB00186
## 2 age_grp.2 age_grp.3 age_grp n_visits n_visits HMDB00186
## 3 age_grp.2 age_grp.3 age_grp n_visits n_visits HMDB00190
## 4 age_grp.2 age_grp.3 age_grp n_visits n_visits HMDB00190
## exposure adj_uid metabolite_name
## 1 Alpha-Lactose age_grp.2 age_grp.3 lactose
## 2 Alpha-Lactose age_grp.2 age_grp.3 lactose
## 3 L-Lactic acid age_grp.2 age_grp.3 lactate
## 4 L-Lactic acid age_grp.2 age_grp.3 lactate
##
## $Effects
## run outcomespec exposurespec term estimate std.error statistic
## 1 1 n_visits lactose lactose 0.00624383 0.01225956 0.5093028
## 2 2 n_visits lactate lactate 0.01679890 0.07693599 0.2183491
## pvalue estimate.lower estimate.upper metabolite_name
## 1 0.6105400 -0.01778447 0.03027213 lactose
## 2 0.8271571 -0.13399287 0.16759068 lactate
##
## $Errors_Warnings
## [1] type object message
## <0 rows> (or 0-length row.names)
##
## $Table1
## variable in.model type category n n.unique min quartile1 median
## 1 n_visits outcome continuous 1000 6 0 1 2
## 2 age_grp adjustment categorical 1 271 NA NA NA NA
## 3 2 565 NA NA NA NA
## 4 3 164 NA NA NA NA
## mean quartile3 max n.missing
## 1 2.037 3 5 0
## 2 NA NA NA NA
## 3 NA NA NA NA
## 4 NA NA NA NA
##
## $Info
## name
## 1 date
## 2 cohort
## 3 RcometsAnalytics version
## 4 R version
## 5 operating system
## 6 input file
## 7 model name
## 8 run type
## 9 op$check.cor.method
## 10 op$check.cor.cutoff
## 11 op$check.nsubjects
## 12 op$max.nstrata
## 13 op$model
## 14 op$output.Effects
## 15 op$output.ModelSummary
## 16 op$output.ci_alpha
## 17 op$output.metab.cols
## 18 op$output.type
## 19 op$output.merge
## 20 op$chemEnrich
## 21 op$chemEnrich.adjPvalue
## 22 model.options$family
## 23 model.options$link
## 24 outcome
## 25 exposure
## 26 strata
## value
## 1 2022-10-21 13:10:08
## 2 DPP
## 3 2.0.6.0
## 4 R version 4.2.0 (2022-04-22)
## 5 CentOS Linux 7 (Core)
## 6 /gpfs/gsfs10/users/wheelerwi/Rlibs/RcometsAnalytics/extdata/cometsInputAge.xlsx
## 7
## 8 Interactive
## 9 spearman
## 10 0.97
## 11 25
## 12 10
## 13 glm
## 14 exposure
## 15 anova
## 16 0.95
## 17 metabolite_name
## 18 xlsx
## 19 none
## 20 0
## 21 0.05
## 22 poisson
## 23 log
## 24 n_visits
## 25 *
## 26
##
## attr(,"ptime")
## [1] "Processing time: 0.294 sec"
## attr(,"class")
## [1] "runModel"
Call getModelData() to define a model which adjusts for age group, has event as the outcome variable, time as the time-to-event variable, and has lactose and lactate as the exposure variables. The variable event must be binary (0-1), and variable time must be positive.
exmodeldata <- RcometsAnalytics::getModelData(exmetabdata,modelspec="Interactive", adjvars="age_grp",
outcomes="event", timevar="time", exposures=c("lactose","lactate"))
To run a survival model, the list of options op must also include a model.options list with model set to “coxph”.
op <- list(model="coxph")
coxph_results <- RcometsAnalytics::runModel(exmodeldata, exmetabdata, "DPP", op=op)
print(coxph_results)
## $ModelSummary
## run outcomespec exposurespec term wald.pvalue nobs message
## 1 1 event lactose lactose 0.7526633 1000
## 2 1 event lactose age_grp 0.3581715 1000
## 3 2 event lactate lactate 0.6910693 1000
## 4 2 event lactate age_grp 0.3754580 1000
## adjvars adjvars.removed adjspec outcome_uid outcome exposure_uid
## 1 age_grp.2 age_grp.3 age_grp event event HMDB00186
## 2 age_grp.2 age_grp.3 age_grp event event HMDB00186
## 3 age_grp.2 age_grp.3 age_grp event event HMDB00190
## 4 age_grp.2 age_grp.3 age_grp event event HMDB00190
## exposure adj_uid metabolite_name
## 1 Alpha-Lactose age_grp.2 age_grp.3 lactose
## 2 Alpha-Lactose age_grp.2 age_grp.3 lactose
## 3 L-Lactic acid age_grp.2 age_grp.3 lactate
## 4 L-Lactic acid age_grp.2 age_grp.3 lactate
##
## $Effects
## run outcomespec exposurespec term estimate std.error statistic
## 1 1 event lactose lactose 0.01104201 0.03503959 0.3151295
## 2 2 event lactate lactate 0.08702160 0.21897494 0.3974044
## pvalue exp.estimate exp.std.error estimate.lower estimate.upper
## 1 0.7526633 1.011103 0.03542864 -0.05763432 0.07971834
## 2 0.6910693 1.090920 0.23888419 -0.34216140 0.51620459
## exp.estimate.lower exp.estimate.upper metabolite_name
## 1 0.9439951 1.082982 lactose
## 2 0.7102336 1.675656 lactate
##
## $Errors_Warnings
## [1] type object message
## <0 rows> (or 0-length row.names)
##
## $Table1
## variable in.model type category n n.outcomeEqual0 n.outcomeEqual1
## 1 event outcome continuous 1000 749 251
## 2 age_grp adjustment categorical 1 271 200 71
## 3 2 565 420 145
## 4 3 164 129 35
## 5 time time continuous 1000 749 251
## n.unique min quartile1 median mean quartile3 max n.missing
## 1 2 0 0 0 0.251 1.00 1 0
## 2 NA NA NA NA NA NA NA NA
## 3 NA NA NA NA NA NA NA NA
## 4 NA NA NA NA NA NA NA NA
## 5 49 0 11 24 23.828 35.25 48 0
##
## $Info
## name
## 1 date
## 2 cohort
## 3 RcometsAnalytics version
## 4 R version
## 5 operating system
## 6 input file
## 7 model name
## 8 run type
## 9 op$check.cor.method
## 10 op$check.cor.cutoff
## 11 op$check.nsubjects
## 12 op$max.nstrata
## 13 op$model
## 14 op$output.Effects
## 15 op$output.ModelSummary
## 16 op$output.ci_alpha
## 17 op$output.metab.cols
## 18 op$output.type
## 19 op$output.merge
## 20 op$chemEnrich
## 21 op$chemEnrich.adjPvalue
## 22 model.options$ties
## 23 model.options$family
## 24 outcome
## 25 exposure
## 26 strata
## value
## 1 2022-10-21 13:10:09
## 2 DPP
## 3 2.0.6.0
## 4 R version 4.2.0 (2022-04-22)
## 5 CentOS Linux 7 (Core)
## 6 /gpfs/gsfs10/users/wheelerwi/Rlibs/RcometsAnalytics/extdata/cometsInputAge.xlsx
## 7
## 8 Interactive
## 9 spearman
## 10 0.97
## 11 25
## 12 10
## 13 coxph
## 14 exposure
## 15 anova
## 16 0.95
## 17 metabolite_name
## 18 xlsx
## 19 none
## 20 0
## 21 0.05
## 22 efron
## 23 binomial
## 24 event
## 25 *
## 26
##
## attr(,"ptime")
## [1] "Processing time: 0.295 sec"
## attr(,"class")
## [1] "runModel"
Call getModelData() to define a model which adjusts for age group, has nested_case as the outcome variable, matchedSet as the group variable, and has lactose and lactate as the exposure variables. The variable nested_case must be binary (0-1), and variable matchedSet defines the matched sets of groups in the data.
exmodeldata <- RcometsAnalytics::getModelData(exmetabdata,modelspec="Interactive", adjvars="age_grp",
outcomes="nested_case", groupvar="matchedSet", exposures=c("lactose","lactate"))
To run a survival model, the list of options op must also include a model.options list with model set to “clogit”.
op <- list(model="clogit")
clogit_results <- RcometsAnalytics::runModel(exmodeldata, exmetabdata, "DPP", op=op)
print(clogit_results$ModelSummary)
## run outcomespec exposurespec term wald.pvalue nobs message
## 1 1 nested_case lactose lactose 0.51379179 1000
## 2 1 nested_case lactose age_grp 0.68081085 1000
## 3 2 nested_case lactate lactate 0.01093298 1000
## 4 2 nested_case lactate age_grp 0.63719144 1000
## adjvars adjvars.removed adjspec outcome_uid outcome
## 1 age_grp.2 age_grp.3 age_grp nested_case nested_case
## 2 age_grp.2 age_grp.3 age_grp nested_case nested_case
## 3 age_grp.2 age_grp.3 age_grp nested_case nested_case
## 4 age_grp.2 age_grp.3 age_grp nested_case nested_case
## exposure_uid exposure adj_uid metabolite_name
## 1 HMDB00186 Alpha-Lactose age_grp.2 age_grp.3 lactose
## 2 HMDB00186 Alpha-Lactose age_grp.2 age_grp.3 lactose
## 3 HMDB00190 L-Lactic acid age_grp.2 age_grp.3 lactate
## 4 HMDB00190 L-Lactic acid age_grp.2 age_grp.3 lactate
print(clogit_results$Effects)
## run outcomespec exposurespec term estimate std.error statistic
## 1 1 nested_case lactose lactose 0.02218697 0.03397984 0.6529449
## 2 2 nested_case lactate lactate 0.57189296 0.22472705 2.5448337
## pvalue exp.estimate exp.std.error estimate.lower estimate.upper
## 1 0.51379179 1.022435 0.03474218 -0.0444123 0.08878624
## 2 0.01093298 1.771617 0.39813036 0.1314360 1.01234988
## exp.estimate.lower exp.estimate.upper metabolite_name
## 1 0.9565595 1.092847 lactose
## 2 1.1404650 2.752060 lactate
All models desginated in the input file can be run with one command, and individual output Excel files or correlation results will be written in the current directory by default. The function returns a list of objects.
allresults <- RcometsAnalytics::runAllModels(exmetabdata,writeTofile=TRUE)
sessionInfo()
## R version 4.2.0 (2022-04-22)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: CentOS Linux 7 (Core)
##
## Matrix products: default
## BLAS/LAPACK: /usr/local/intel/compilers_and_libraries_2020.2.254/linux/mkl/lib/intel64_lin/libmkl_rt.so
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## loaded via a namespace (and not attached):
## [1] nlme_3.1-157 lubridate_1.8.0 webshot_0.5.3
## [4] RColorBrewer_1.1-3 httr_1.4.4 tools_4.2.0
## [7] backports_1.4.1 bslib_0.4.0 ISwR_2.0-8
## [10] utf8_1.2.2 R6_2.5.1 rpart_4.1.16
## [13] DBI_1.1.3 lazyeval_0.2.2 colorspace_2.0-3
## [16] nnet_7.3-17 withr_2.5.0 mnormt_2.1.0
## [19] tidyselect_1.1.2 gridExtra_2.3 compiler_4.2.0
## [22] cli_3.4.0 TSP_1.2-1 plotly_4.10.0
## [25] labeling_0.4.2 sass_0.4.2 scales_1.2.1
## [28] psych_2.2.5 stringr_1.4.1 digest_0.6.29
## [31] rmarkdown_2.16 pkgconfig_2.0.3 htmltools_0.5.3
## [34] parallelly_1.32.1 fastmap_1.1.0 readxl_1.4.1
## [37] htmlwidgets_1.5.4 rlang_1.0.5 rstudioapi_0.14
## [40] farver_2.1.1 jquerylib_0.1.4 generics_0.1.3
## [43] jsonlite_1.8.0 crosstalk_1.2.0 dendextend_1.16.0
## [46] dplyr_1.0.10 ModelMetrics_1.2.2.2 magrittr_2.0.3
## [49] Matrix_1.5-1 Rcpp_1.0.9 munsell_0.5.0
## [52] fansi_1.0.3 viridis_0.6.2 lifecycle_1.0.2
## [55] stringi_1.7.8 pROC_1.18.0 yaml_2.3.5
## [58] MASS_7.3-57 plyr_1.8.7 recipes_1.0.1
## [61] grid_4.2.0 parallel_4.2.0 listenv_0.8.0
## [64] ppcor_1.1 lattice_0.20-45 splines_4.2.0
## [67] knitr_1.40 pillar_1.8.1 corpcor_1.6.10
## [70] future.apply_1.9.1 reshape2_1.4.4 codetools_0.2-18
## [73] stats4_4.2.0 glue_1.6.2 evaluate_0.16
## [76] data.table_1.14.2 vctrs_0.4.1 foreach_1.5.2
## [79] cellranger_1.1.0 gtable_0.3.1 purrr_0.3.4
## [82] tidyr_1.2.1 future_1.28.0 heatmaply_1.3.0
## [85] assertthat_0.2.1 cachem_1.0.6 ggplot2_3.3.6
## [88] xfun_0.33 gower_1.0.0 prodlim_2019.11.13
## [91] RcometsAnalytics_2.0.7.0 broom_1.0.1 class_7.3-20
## [94] survival_3.3-1 viridisLite_0.4.1 timeDate_4021.104
## [97] seriation_1.3.6 tibble_3.1.8 iterators_1.0.14
## [100] registry_0.5-1 hardhat_1.2.0 lava_1.6.10
## [103] globals_0.16.1 ellipsis_0.3.2 caret_6.0-93
## [106] subselect_0.15.3 ipred_0.9-13